SEQUENTIAL MONITORING WITH CONDITIONAL RANDOMIZATION TESTS

By Victoria Plamadeala and William F. Rosenberger

Precision Therapeutics and George Mason University 

Sequential monitoring in clinical trials is often employed to al- 
low for early stopping and other interim decisions, while maintain- 
ing the type I error rate. However, sequential monitoring is typi- 
cally described only in the context of a population model. We de- 
scribe a computational method to implement sequential monitoring 
in a randomization-based context. In particular, we discuss a new 
technique for the computation of approximate conditional tests fol- 
lowing restricted randomization procedures and then apply this tech- 
nique to approximate the joint distribution of sequentially computed 
conditional randomization tests. We also describe the computation 
of a randomization-based analog of the information fraction. We ap-
ply these techniques to a restricted randomization procedure, Efron's
[Biometrika 58 (1971) 403-417] biased coin design. These techniques
require derivation of certain conditional probabilities and conditional 
covariances of the randomization procedure. We employ combinatoric 
techniques to derive these for the biased coin design. 

1. Introduction. Sequential monitoring refers to analyzing data periodi- 
cally during the course of a clinical trial, with the purpose of detecting early 
evidence in support of or against a hypothesis. A desirable feature of such 
a monitoring plan would be flexible inspections of the data that can occur 
at arbitrary time points. At the same time, sequentially tested hypotheses 
must maintain the overall probability of type I error at the prespecified level, 
since repeated testing is known to inflate it. The Lan and DeMets (1983) 
error spending approach for sequential monitoring allows this. The approach 
makes use of a type I error spending function, which depends on the amount
of "statistical information" available at the time of the interim inspection. In
the context of sequential monitoring, the statistical information is a measure
of how far a trial has progressed. Under a population model, the amount 
of interim information — the information fraction — is defined as the propor- 
tion of Fisher's information observed thus far in the trial. The type I error 
spending function rations the amount of type I error that may be spent at 
each look commensurate to the information fraction. The critical value asso- 
ciated with the allowable probability of type I error at a certain interim look 
is obtained and compared to the observed value of the statistic. The decision 
whether to continue, or stop, the trial is based on this comparison. Sequen- 
tial monitoring is typically discussed in the context of a population model. 
However, it is not uncommon for the FDA to require a "re-analysis" of data 
using a "re-randomization test," or, as we call it here, a randomization test, 
defined below. 

Let $T = (T_1, \ldots, T_n)$ be a randomization sequence, where $T_i = 1$ if subject $i$ is randomized to treatment 1 and $T_i = 0$ if subject $i$ is randomized to treatment 2, $i = 1, \ldots, n$. Let $N_1(j) = \sum_{i=1}^{j} T_i$ be the number of subjects randomized to treatment 1 after $j$ assignments. Let $X = (X_1, \ldots, X_n)$ be the responses based on some primary outcome variable, and let $x$ be the realization. A valid test of the treatment effect can be conducted by permuting $T$ in all possible ways [e.g., Lehmann (1986), Chapter 5]. However, if one wishes to incorporate the randomized design into the analysis, under restricted randomization, such permutations are not equiprobable [e.g., Rosenberger and Lachin (2002), Chapter 7]. The family of linear rank tests provides a large class of test statistics with which to conduct randomization tests. The form of the statistic is $V(T) = a_n' T$, for a score vector $a_n = (a_{1n} - \bar{a}_n, \ldots, a_{nn} - \bar{a}_n)'$, where $a_{jn}$ is some function of the rank of the $j$th observation and $\bar{a}_n = \sum_{j=1}^{n} a_{jn}/n$. The p-value of the randomization test is computed with respect to a reference set of sequences. The unconditional reference set contains all possible allocation sequences, including those that give little or no information about the treatment effect (e.g., $1, 1, \ldots, 1$). Also, the random numbers on each treatment arm, $N_1(n)$ and $N_2(n)$, are ancillary statistics, and therefore the conditional reference set is preferred, which finds probabilities conditional on $N_1(n) = n_1$, that is, the observed number on treatment 1 [e.g., Cox (1982), Berger (2000)]. This leads to a conditional test.
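As a small illustration of the statistic $V(T) = a_n' T$, the following Python sketch computes it with simple rank scores (the score choice used later for Table 2). The function names and the toy data are assumptions made for this example only, and ties in the responses are not handled.

```python
def simple_rank_scores(x):
    """Centered simple rank scores a_jn - mean(a_n); assumes no ties in x."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    ranks = [0] * len(x)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    mean_rank = sum(ranks) / len(ranks)
    return [r - mean_rank for r in ranks]


def linear_rank_statistic(x, t):
    """V(T) = sum_j (a_jn - mean(a_n)) * T_j for responses x and assignments t."""
    return sum(a * ti for a, ti in zip(simple_rank_scores(x), t))


# Toy data (hypothetical): four responses and one allocation sequence.
x = [2.3, 0.7, 1.9, 3.1]
t = [1, 0, 0, 1]
v = linear_rank_statistic(x, t)
```

Here the ranks are (3, 1, 2, 4), the centered scores are (0.5, -1.5, -0.5, 1.5), and only the scores of subjects assigned to treatment 1 contribute to the sum.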

The literature is largely silent on the subject of sequential monitoring of randomization tests (brief exceptions are found in Rosenberger and Lachin [(2002), Section 7.10] and Zhang and Rosenberger (2008), whose techniques only extend to one interim inspection). The computation of conditional randomization tests is also inherently difficult, even without sequential monitoring. We address these issues in this paper by proposing a technique, based on deriving exact conditional distributions of randomization procedures, that leads to a simple computational method for approximating the distribution of sequentially computed randomization tests. We also discuss the appropriate analog for "information fraction" in the context of a randomization model. Our focus will be on one particular restricted randomization procedure, Efron's (1971) biased coin design, which induces a beautiful closed-form combinatoric structure that facilitates such an analysis. However, the technique can be applied to any randomization procedure for which we can determine certain exact conditional distributional results.

Let $\phi_{j+1}$ be a restricted randomization procedure such that

$$\phi_{j+1} = \Pr(T_{j+1} = 1 \mid N_1(j)).$$

Efron's (1971) biased coin design is a restricted randomization procedure 
for clinical trials that has exceptional properties: it balances treatment as- 
signments throughout the course of the trial with low variability [e.g., An- 
tognini (2008)], and it mitigates selection and accidental biases [Rosenberger 
and Lachin (2002)]. Then the biased coin design BCD($p$) for a parameter $p \in [1/2, 1]$, $q = 1 - p$, is defined as

$$\phi_{j+1} = \begin{cases} 1/2, & \text{when } N_1(j) = j/2, \\ p, & \text{when } N_1(j) < j/2, \\ q, & \text{when } N_1(j) > j/2, \end{cases} \qquad j = 0, 1, 2, \ldots. \tag{1.1}$$
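The allocation rule (1.1) is simple to simulate. The sketch below is a minimal Python illustration; the function names `bcd_prob` and `simulate_bcd` and the seed are assumptions introduced here, not part of the paper.

```python
import random


def bcd_prob(n1_j, j, p):
    """phi_{j+1} in (1.1): probability that subject j+1 receives treatment 1."""
    if 2 * n1_j == j:          # balanced so far: fair coin
        return 0.5
    return p if 2 * n1_j < j else 1.0 - p


def simulate_bcd(n, p, rng):
    """Generate one allocation sequence T_1, ..., T_n under BCD(p)."""
    t, n1 = [], 0
    for j in range(n):
        assign = 1 if rng.random() < bcd_prob(n1, j, p) else 0
        t.append(assign)
        n1 += assign
    return t


seq = simulate_bcd(20, 2 / 3, random.Random(1))
```

With `p = 1.0` the coin is deterministic whenever the arms are unbalanced, so every even prefix of the sequence is exactly balanced.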

Note that $p = 0.5$ results in complete randomization and $p = 1$ results in permuted blocks with block size 2. When $p < 1$, the design is fully randomized, in that each subject will be assigned to treatment randomly, which differs markedly from the permuted block design, where some subjects in the tail of each block are assigned with probability 1. Let $D_j = 2N_1(j) - j$ be the difference in numbers assigned to treatments 1 and 2; $\{|D_n|\}_{n=1}^{\infty}$ forms an asymmetric random walk when $p \in (1/2, 1)$. Markaryan and Rosenberger (2010) derive the exact distribution of $D_j$ for the BCD($p$), from which the exact distribution of $N_1(n)$ follows immediately:

$$P(N_1(n) = n_1) = \begin{cases}
p^{n_1} \displaystyle\sum_{l=0}^{n_1} \frac{n - 2l}{n + 2l} \binom{n_1 + l}{l} q^{l}, & n_1 = \dfrac{n}{2}, \\[2ex]
\dfrac{p^{n_1}}{2} \displaystyle\sum_{l=0}^{n_1} \frac{n - n_1 - l}{n - n_1 + l} \binom{n - n_1 + l}{l} q^{n - 2n_1 + l - 1}, & 0 \le n_1 < \dfrac{n}{2}, \\[2ex]
\dfrac{p^{n - n_1}}{2} \displaystyle\sum_{l=0}^{n - n_1} \frac{n_1 - l}{n_1 + l} \binom{n_1 + l}{l} q^{2n_1 - n + l - 1}, & \dfrac{n}{2} < n_1 \le n.
\end{cases} \tag{1.2}$$

Their paper also provides the exact expression for the variance-covariance matrix of the treatment assignments $T$.

In this paper we provide the exact conditional distribution of $N_1(n)$ given $N_1(j)$, $1 \le j \le n$, and the expression for the variance-covariance matrix of $T$ given $N_1(n)$, $\Sigma_{|n_1}$. While these are heretofore unknown results on theoretical properties of a random walk, our primary interest is that these results give us a computational method to approximate conditional randomization tests following the BCD($p$). We then extend these results to the case where sequential analysis is implemented in the course of a clinical trial.

Rosenberger and Lachin (2002) distinguish among three techniques that can be used to compute randomization tests: exact, Monte Carlo and asymptotic. Exact tests are computationally intensive, even with today's computing, and require networking algorithms [Mehta, Patel and Wei (1988)]. Hollander and Peña (1988) developed a clever recursive algorithm to determine the exact distribution of both conditional and unconditional randomization tests following Efron's biased coin design and applied it to a sample of size $n = 37$. It can be assumed that such computational techniques would be able to solve much larger problems with today's computing resources. While authors have determined the asymptotic normality of conditional randomization tests under various score functions and randomization procedures [e.g., Smythe (1988)], Efron's biased coin induces a stationary distribution, and hence the test statistic may not be asymptotically normal. This phenomenon was noted in a number of papers, first by counterexample in Smythe and Wei (1983) for the unconditional test, and then by simulation by Hollander and Peña (1988) for the conditional test.

Mehta, Patel and Senchaudhuri (1988) use importance sampling to estimate the conditional randomization test's p-value; their technique employs an elegant, but complex, networking algorithm. The efficiency of the estimator relies on the convergence to normality of the test statistic, which may not hold under the biased coin design. One might be able to modify the network algorithm in Mehta, Patel and Wei (1988) or the recursive algorithm in Hollander and Peña (1988) to compute the exact distribution of sequentially monitored conditional randomization tests, but here we provide a method that is not very computationally intensive and allows us to sample directly from the conditional reference set under a broad class of restricted randomization procedures.

The paper is organized as follows. In Section 2, we present a method for 
sampling directly from the conditional distribution of V(T), which facilitates 
the computation of conditional tests. We need to compute certain exact con- 
ditional probabilities to apply this method, and we do this for Efron's biased 
coin design. We extend this application to develop a computational technique 
for sequential monitoring of conditional randomization tests in Section 3. In 
Section 4, we describe the analog of "information" in the context of a ran- 
domization model. In defining a randomization-based information fraction, 
we must derive the conditional variance-covariance matrix of T, which we 
do for Efron's biased coin design. We draw conclusions in Section 5. All the 
major proofs, some of which require careful combinatorics, are relegated to 
the online supplement.

2. Computation of conditional randomization tests.

2.1. Generating sequences from the unconditional reference set. Suppose a total of $n$ subjects are randomized to two treatments. Let $n_1$ be the observed number of assignments on treatment 1. A conditional randomization test can be approximated by sampling sufficiently many sequences from the conditional reference set, $\Omega_c$, the collection of sequences that satisfy $N_1(n) = n_1$. This can be achieved by generating sequences from the unconditional reference set, the set of all possible assignments, and retaining those that belong to $\Omega_c$.

Suppose at least $N_c$ sequences that satisfy $N_1(n) = n_1$ are sufficient to approximate the conditional randomization distribution of $V(T)$. Let $K$ sequences be sampled, $T_1, \ldots, T_K$, independently and with replacement from the unconditional reference set using $\phi_{j+1}$ as the sampling mechanism. This number of Monte Carlo sequences must be large enough such that at least $N_c$ sequences satisfy the condition $N_1(n) = n_1$. The requisite number of sequences, $K$, follows a negative binomial distribution with parameters $\pi = P(N_1(n) = n_1)$ and $r = N_c$ [Zhang and Rosenberger (2011)]. Let $N$ denote a value in the range of $K$, $N = N_c, N_c + 1, \ldots$. For $k = 1, \ldots, N$, a sequence $T_k = t$ is sampled from the unconditional reference set with probability

$$f(t) = \frac{1}{2} \prod_{j=1}^{n-1} (\phi_{j+1})^{t_{j+1}} (1 - \phi_{j+1})^{1 - t_{j+1}}, \tag{2.1}$$

where $t_{j+1}$ are the observed values of $T_{j+1}$. The $j$th sampled sequence induces two Bernoulli random variables

$$Y_j = \begin{cases} 1, & \text{if } N_1(n) = n_1, \\ 0, & \text{otherwise}, \end{cases}$$

and

$$X_j = \begin{cases} 1, & \text{if } N_1(n) = n_1 \text{ and } V(T_j) \ge v^*, \\ 0, & \text{otherwise}, \end{cases}$$

where $v^*$ is the observed value of the statistic. A strongly consistent estimator for the p-value of the upper-tailed conditional test can be computed as

$$\frac{\sum_{j=1}^{N} X_j}{\sum_{j=1}^{N} Y_j}.$$

Table 1 reports the 95th percentile of $K$ when sampling from the unconditional reference set with $N_c = 2500$ for Efron's biased coin design. These sample sizes are reasonable when there is perfect balance in the assignments, but increase considerably in the presence of imbalance. This technique therefore becomes impractical in the presence of imbalance.

Table 1
Approximate 95th percentile of K for various n, n_1, N_c = 2500

  n      n_1 = 0.45n           n_1 = 0.48n          n_1 = 0.50n
  ----------------------------------------------------------------
  BCD(p = 2/3)
  100    3,531,344             55,060               5117
  200    3,611,280,266         881,557              5117
  500    3,877,310 x 10^12     3,611,026,232        5117
  BCD(p = 3/4)
  100    114,384,212           156,865              3822
  200    6,754,269 x 10^6      12,709,307           3822
  500    1,390,644 x 10^21     6,754,269 x 10^6     3822
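To see where numbers of the magnitude in Table 1 come from, the following Python sketch computes $\pi = P(N_1(n) = n_1)$ by a direct forward recursion over the allocation rule and then approximates the 95th percentile of the negative binomial $K$ by a normal approximation (mean $N_c/\pi$, variance $N_c(1-\pi)/\pi^2$). Both the recursion and the normal approximation are choices made for this illustration, not necessarily how the table was produced, and the function names are hypothetical.

```python
from statistics import NormalDist


def n1_distribution(n, p):
    """Forward recursion for P(N_1(n) = m), m = 0..n, under BCD(p)."""
    dist = [1.0]                       # N_1(0) = 0 with probability 1
    for j in range(n):
        nxt = [0.0] * (j + 2)
        for m, prob in enumerate(dist):
            if 2 * m == j:
                phi = 0.5
            elif 2 * m < j:
                phi = p
            else:
                phi = 1.0 - p
            nxt[m + 1] += prob * phi           # subject j+1 -> treatment 1
            nxt[m] += prob * (1.0 - phi)       # subject j+1 -> treatment 2
        dist = nxt
    return dist


def approx_k_percentile(n, n1, p, n_c=2500, level=0.95):
    """Normal approximation to a percentile of K ~ NegBin(r = n_c, pi)."""
    pi = n1_distribution(n, p)[n1]
    mean = n_c / pi
    sd = (n_c * (1.0 - pi)) ** 0.5 / pi
    return mean + NormalDist().inv_cdf(level) * sd


k95_balanced = approx_k_percentile(100, 50, 2 / 3)
```

Under these assumptions the balanced cases land within a few units of the 5117 and 3822 reported in Table 1; the imbalanced columns follow from the same function but grow rapidly because $\pi$ becomes very small.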

2.2. Our method: Generating sequences from the conditional reference set. Rather than sampling too many sequences and discarding those that do not satisfy the condition $N_1(n) = n_1$, it is more efficient to sample directly from $\Omega_c$, the collection of all randomization sequences that satisfy the condition $N_1(n) = n_1$. The set $\Omega_c$ will be called the conditional reference set. Let $N_c$ randomization sequences, $T_1, \ldots, T_{N_c}$, be sampled independently and with replacement strictly from $\Omega_c$, with respective probabilities $h(t_1), \ldots, h(t_{N_c})$. For an upper-tailed test, the $k$th sampled sequence induces a Bernoulli random variable

$$V_k = \begin{cases} 1, & \text{if } V(T_k) \ge v^*, \\ 0, & \text{otherwise}. \end{cases} \tag{2.3}$$

The Monte Carlo estimator of the upper-tailed test's p-value is the strongly consistent and unbiased estimator $\bar{V} = \sum_{k=1}^{N_c} V_k / N_c$. (It may be possible to find an estimator with a smaller variance, but we do not address the issue of estimation of p-values in this paper.)

To guarantee a sequence from $\Omega_c$, $T_{j+1}$ in $\phi_{j+1}$ must be conditioned on both $N_1(j)$ and $N_1(n)$. Consequently, for $0 \le m_j \le j$, the procedure

$$p_{j+1} = \begin{cases} P(T_{j+1} = 1 \mid N_1(j) = m_j, N_1(n) = n_1), & 1 \le j \le n - 1, \\ P(T_{j+1} = 1 \mid N_1(n) = n_1), & j = 0, \end{cases} \tag{2.4}$$

must be applied to generate a random sequence strictly from $\Omega_c$. We now provide a general formula relating the conditional and the unconditional reference sets, which facilitates the generation of sequences from the conditional reference set for any restricted randomization procedure of the form $\phi_{j+1} = \Pr(T_{j+1} = 1 \mid N_1(j))$.

Theorem 2.1. For $n = 1, 2, 3, \ldots$, $0 \le n_1 \le n$, $0 \le j < n$, $0 \le m_j \le j$ and $\phi_{j+1}(m_j) = P(T_{j+1} = 1 \mid N_1(j) = m_j)$, the rule

$$p_{j+1} = \begin{cases} \phi_{j+1}(m_j) \dfrac{P(N_1(n) = n_1 \mid N_1(j+1) = m_j + 1)}{P(N_1(n) = n_1 \mid N_1(j) = m_j)}, & 1 \le j \le n - 1, \\[2ex] \phi_{j+1}(m_j) \dfrac{P(N_1(n) = n_1 \mid T_{j+1} = 1)}{P(N_1(n) = n_1)}, & j = 0, \end{cases} \tag{2.5}$$

can be used to sample a sequence that satisfies $N_1(n) = n_1$.

Proof. The result follows from an application of Bayes' theorem to (2.4) and the Markovian property of $N_1$. □

Furthermore, for $k = 1, \ldots, N_c$, a sequence $T_k = t$ is sampled from $\Omega_c$ with probability

$$h(t) = \prod_{j=0}^{n-1} (p_{j+1})^{t_{j+1}} (1 - p_{j+1})^{1 - t_{j+1}}. \tag{2.6}$$

In the simplest case, complete randomization, $p_{j+1} = (n_1 - m_j)/(n - j)$, $0 \le j \le n - 1$, and this is the random allocation rule [see Rosenberger and Lachin (2002)], which is sometimes used to fill permuted blocks.
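The sampling rule (2.5) can be sketched in a few lines of Python. In this illustration the conditional probabilities $P(N_1(n) = n_1 \mid N_1(j) = m_j)$ are obtained numerically by a backward recursion over the allocation rule, which is a stand-in chosen here for brevity rather than a closed-form expression; the function names are assumptions made for the example.

```python
import random


def phi(m, j, p):
    """BCD(p) probability of treatment 1 for subject j+1 given N_1(j) = m."""
    if 2 * m == j:
        return 0.5
    return p if 2 * m < j else 1.0 - p


def backward_table(n, n1, p):
    """b[j][m] = P(N_1(n) = n1 | N_1(j) = m), by backward recursion."""
    b = [[0.0] * (j + 1) for j in range(n + 1)]
    b[n][n1] = 1.0
    for j in range(n - 1, -1, -1):
        for m in range(j + 1):
            f = phi(m, j, p)
            b[j][m] = f * b[j + 1][m + 1] + (1.0 - f) * b[j + 1][m]
    return b


def sample_conditional(n, n1, p, rng, b=None):
    """Draw one sequence from the conditional reference set via rule (2.5)."""
    b = b or backward_table(n, n1, p)
    t, m = [], 0
    for j in range(n):
        p_next = phi(m, j, p) * b[j + 1][m + 1] / b[j][m]
        assign = 1 if rng.random() < p_next else 0
        t.append(assign)
        m += assign
    return t


table = backward_table(20, 8, 2 / 3)
draw = sample_conditional(20, 8, 2 / 3, random.Random(0), table)
```

Every draw satisfies $N_1(n) = n_1$ by construction, so no sequences are discarded. With $p = 1/2$ the ratio reduces to $(n_1 - m_j)/(n - j)$, the random allocation rule noted above.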

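The following theorem gives these probabilities for Efron's biased coin design. The distribution of $N_1(n)$ given $N_1(j) = m_j$, $0 \le m_j \le j$, has three cases depending on the value of $m_j$ with respect to $j$, $1 \le j \le n$. Within each case, $P(N_1(n) = n_1 \mid N_1(j) = m_j)$ depends on the value of $n_1$ with respect to $n$, $j$ and $m_j$.

Theorem 2.2. Let $n = 2, 3, 4, \ldots$, $1 \le j \le n$, $0 \le m_j \le j$ and $m_j \le n_1 \le n - j + m_j$. Denote

$$C(x, l) := \frac{x - l}{x + l} \binom{x + l}{l} \quad \text{and} \quad D := \binom{n - j}{n_1 - m_j} - \binom{n - j}{n_1 - j + m_j}.$$

For the BCD($p$):

(1) When $0 \le m_j < j/2$, $P(N_1(n) = n_1 \mid N_1(j) = m_j)$ is

$$\begin{cases}
\dbinom{n - j}{n_1 - m_j} p^{n_1 - m_j} q^{n - j - n_1 + m_j}, & \text{if } m_j \le n_1 < j - m_j, \\[2ex]
0.5\, p^{n_1 - m_j} \displaystyle\sum_{l=0}^{n_1 + m_j - j} C(n - n_1 - m_j, l)\, q^{n - 2n_1 - 1 + l} + D\, p^{n_1 - m_j} q^{n - j - n_1 + m_j}, & \text{if } j - m_j \le n_1 < n/2, \\[2ex]
p^{n_1 - m_j} \displaystyle\sum_{l=0}^{n - j - n_1 + m_j} C(n_1 - m_j, l)\, q^{l}, & \text{if } n_1 = n/2, \\[2ex]
0.5\, p^{n - n_1 - m_j} \displaystyle\sum_{l=0}^{n - j - n_1 + m_j} C(n_1 - m_j, l)\, q^{2n_1 - n - 1 + l}, & \text{if } n/2 < n_1 \le n - j + m_j.
\end{cases}$$

(2) When $m_j = j/2$,

$$P(N_1(n) = n_1 \mid N_1(j) = m_j) = P(N_1(n - j) = n_1 - m_j), \qquad m_j \le n_1 \le n - j + m_j,$$

where the unconditional distribution is derived in Markaryan and Rosenberger (2010).

(3) When $j/2 < m_j \le j$, $P(N_1(n) = n_1 \mid N_1(j) = m_j)$ is

$$\begin{cases}
0.5\, p^{n_1 + m_j - j} \displaystyle\sum_{l=0}^{n_1 - m_j} C(n - j - n_1 + m_j, l)\, q^{n - 2n_1 - 1 + l}, & \text{if } m_j \le n_1 < n/2, \\[2ex]
p^{n - j - n_1 + m_j} \displaystyle\sum_{l=0}^{n_1 - m_j} C(n - j - n_1 + m_j, l)\, q^{l}, & \text{if } n_1 = n/2, \\[2ex]
0.5\, p^{n - j - n_1 + m_j} \displaystyle\sum_{l=0}^{n - n_1 - m_j} C(n_1 + m_j - j, l)\, q^{2n_1 - n - 1 + l} + D\, p^{n - j - n_1 + m_j} q^{n_1 - m_j}, & \text{if } n/2 < n_1 \le n - m_j, \\[2ex]
\dbinom{n - j}{n_1 - m_j} p^{n - j - n_1 + m_j} q^{n_1 - m_j}, & \text{if } n - m_j < n_1 \le n - j + m_j.
\end{cases}$$

Proof. See Appendix A in the supplementary material [Plamadeala and Rosenberger (2011)]. □

Note that if $n = j$ and $n_1 = m_j$, $P(N_1(n) = n_1 \mid N_1(j) = m_j) = 1$, and if $m_j > n_1$ or $n - j < n_1 - m_j$, $P(N_1(n) = n_1 \mid N_1(j) = m_j) = 0$. Also, $P(N_1(n) = n_1 \mid N_1(0) = 0) = P(N_1(n) = n_1)$.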
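The closed forms of Theorem 2.2 can be checked numerically against a brute-force recursion. The Python sketch below is one such check, under two simplifications that are choices of this illustration: case (3) is evaluated through the treatment-label symmetry of the design (swap the roles of the two arms) instead of its own explicit formulas, and the balanced case (2) uses the unconditional closed form (1.2). All function names are hypothetical.

```python
from math import comb


def c(x, l):
    """C(x, l) = (x - l)/(x + l) * binom(x + l, l)."""
    return (x - l) / (x + l) * comb(x + l, l)


def binom(n, k):
    return comb(n, k) if 0 <= k <= n else 0


def p_uncond(n, n1, p):
    """Closed form (1.2) for P(N_1(n) = n1)."""
    q = 1.0 - p
    if n == 0:
        return 1.0 if n1 == 0 else 0.0
    if 2 * n1 == n:
        return p**n1 * sum(c(n1, l) * q**l for l in range(n1 + 1))
    lo, hi = min(n1, n - n1), max(n1, n - n1)     # symmetry between arms
    return 0.5 * p**lo * sum(c(hi, l) * q**(hi - lo + l - 1) for l in range(lo + 1))


def p_cond(n, n1, j, m, p):
    """Theorem 2.2: P(N_1(n) = n1 | N_1(j) = m) under BCD(p)."""
    q = 1.0 - p
    a, b = n1 - m, n - j - n1 + m     # remaining assignments to arms 1 and 2
    if a < 0 or b < 0:
        return 0.0
    if 2 * m == j:                    # case (2): restart from balance
        return p_uncond(n - j, a, p)
    if 2 * m > j:                     # case (3): mirror image of case (1)
        return p_cond(n, n - n1, j, j - m, p)
    if n1 < j - m:                    # balance cannot be reached
        return binom(n - j, a) * p**a * q**b
    d = binom(n - j, a) - binom(n - j, n1 - j + m)
    if 2 * n1 < n:
        s = sum(c(n - n1 - m, l) * q**(n - 2 * n1 - 1 + l) for l in range(n1 + m - j + 1))
        return 0.5 * p**a * s + d * p**a * q**b
    if 2 * n1 == n:
        return p**a * sum(c(a, l) * q**l for l in range(b + 1))
    s = sum(c(a, l) * q**(2 * n1 - n - 1 + l) for l in range(b + 1))
    return 0.5 * p**(n - n1 - m) * s


def p_cond_dp(n, n1, j, m, p):
    """Reference value: recursion over the remaining n - j assignments."""
    dist = {m: 1.0}
    for i in range(j, n):
        nxt = {}
        for k, pr in dist.items():
            f = 0.5 if 2 * k == i else (p if 2 * k < i else 1.0 - p)
            nxt[k + 1] = nxt.get(k + 1, 0.0) + pr * f
            nxt[k] = nxt.get(k, 0.0) + pr * (1.0 - f)
        dist = nxt
    return dist.get(n1, 0.0)


max_err = max(
    abs(p_cond(n, n1, j, m, 0.7) - p_cond_dp(n, n1, j, m, 0.7))
    for n in range(2, 11) for j in range(1, n + 1)
    for m in range(j + 1) for n1 in range(n + 1)
)
```

The quantity `max_err` is the largest discrepancy between the closed forms and the recursion over all admissible $(n, j, m_j, n_1)$ with $n \le 10$ at $p = 0.7$.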

The procedure then follows by simply generating $N_c$ sequences using (2.5). This allows us to reduce the magnitude of the problem from the astronomical numbers in Table 1 to just $N_c$. A satisfactory value for $N_c$ can be obtained from the constraint $\mathrm{MSE}(\bar{V}) = p_c(1 - p_c)/N_c \le 1/(4N_c) \le \varepsilon$, where $p_c$ denotes the true conditional p-value. For $\varepsilon = 0.0001$, $N_c \ge 2500$. Higher precision in estimation is possible by finding $N_c$ that ensures $P(|\bar{V} - p_c| < 0.1 p_c) = 0.99$, for instance. It follows that $N_c \approx (2.576/0.1)^2 (1 - p_c)/p_c$. Thus, to estimate a p-value as large as 0.04 with an error of 10% of 0.04 with 0.99 probability, the Monte Carlo sample size must be $N_c = 15{,}924$. If a smaller p-value is expected, $N_c$ will be larger.

Table 2
Approximations for the upper 0.1 tail of the randomization distribution of the linear rank statistic by sampling from $\Omega_c$; $N_c = 2500$, BCD(0.6)

  Tail probability    n     n_1    Exact     1000 Monte Carlo runs; mean (SD)
  ---------------------------------------------------------------------------
  P(V(T) > 21.5)      30    15     0.1057    0.1053 (0.0061)
  P(V(T) > 23)        30    12     0.1009    0.1008 (0.0059)
  P(V(T) > 31)        40    20     0.1011    0.1009 (0.0061)
  P(V(T) > 34)        40    16     0.1000    0.0997 (0.0060)
  P(V(T) > 82)        100   50               0.1055 (0.0060)
  P(V(T) > 113)       100   40               0.1043 (0.0062)
  P(V(T) > 299)       500   250              0.1104 (0.0063)
  P(V(T) > 1000)      500   200              0.1030 (0.0058)

Table 2 provides approximations for the upper 0.1 tail of the linear rank statistic with simple rank scores under the BCD(0.6) randomization. For small sample sizes, we also provide the exact p-value for comparison purposes; Monte Carlo estimates are very close with small variability. As expected, the variability of the estimates does not change across different sample sizes $n$. The computational complexity of the sampling scheme for the BCD is invariant to the value of $p$. Comparing Tables 1 and 2, the conditional distribution method reduces the Monte Carlo sample size to a few thousand.
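The two sample-size rules quoted above are easy to reproduce. The following Python sketch is a minimal illustration with hypothetical function names; it uses the exact normal quantile rather than the rounded value 2.576.

```python
from math import ceil
from statistics import NormalDist


def n_c_for_mse(eps):
    """Smallest N_c with 1/(4 N_c) <= eps (worst case p_c = 1/2)."""
    # round() guards against floating-point noise before taking the ceiling
    return ceil(round(1.0 / (4.0 * eps), 9))


def n_c_for_relative_error(p_c, rel=0.1, prob=0.99):
    """N_c such that P(|V_bar - p_c| < rel * p_c) is about prob (normal approx.)."""
    z = NormalDist().inv_cdf(0.5 + prob / 2.0)
    return ceil((z / rel) ** 2 * (1.0 - p_c) / p_c)


n_mse = n_c_for_mse(1e-4)
n_rel = n_c_for_relative_error(0.04)
```

Because $(1 - p_c)/p_c$ grows as $p_c$ shrinks, the second rule demands many more sequences for small p-values, which is the point made in the text.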

Following stratification on known covariates, the computation of a stratified linear rank test based on the conditional randomization distribution is straightforward: the stratum-specific linear-rank test statistics are summed over the independent strata. Using the methodology described in this section, a sequence is sampled independently from the conditional reference set of each stratum; the linear-rank statistic is evaluated in each stratum and the stratum-specific test statistics are summed. The process is repeated $N_c$ times, and the stratified test's p-value is estimated by the proportion of summed statistics as or more extreme than the one observed.

3. Extension to sequential monitoring. Suppose there are $L - 1$ interim inspections of the data after $1 \le r_1 < r_2 < \cdots < r_{L-1} < r_L = n$ patients responded. Let $0 < t_1 < t_2 < \cdots < t_{L-1} < t_L = 1$ be the corresponding information fractions at those inspections. For conditional tests, let $N_1(r_1), N_1(r_2), \ldots, N_1(r_{L-1})$, and $N_1(r_L) = N_1(n)$ be the sample sizes randomized to treatment 1 after inspections $1, \ldots, L$ and let $n_{11}, \ldots, n_{1(L-1)}$, and $n_{1L} = n_1$ be realizations of these sample sizes. Let the linear-rank randomization test statistic computed at each of the inspections be given by $V_{r_l} = \sum_{j=1}^{r_l} (a_{j r_l} - \bar{a}_{r_l}) T_j = a_{r_l}' T^{(r_l)}$, $l = 1, \ldots, L$. Using the alpha-spending function approach [Lan and DeMets (1983)], let $\alpha^*(t)$, $t \in [0, 1]$, be a nondecreasing function such that $\alpha^*(0) = 0$ and $\alpha^*(1) = \alpha$, the significance level of the one-sided test. One such function is $\alpha^*(t) = 2 - 2\Phi(z_{\alpha/2}/\sqrt{t})$, $0 < t \le 1$; $\alpha^*(0) = 0$, where $\Phi$ is the standard normal distribution function and $z_{\alpha/2} = \Phi^{-1}(1 - \alpha/2)$ [Lan and DeMets (1983), O'Brien and Fleming (1979)]. Following Zhang and Rosenberger (2008), the upper-tailed, conditional randomization test with $L$ looks involves finding $d_1, \ldots, d_L$ such that

$$\begin{cases}
P(V_{r_1} > d_1 \mid N_1(r_1) = n_{11}) = \alpha^*(t_1), \\[1ex]
P(V_{r_1} \le d_1, V_{r_2} > d_2 \mid N_1(r_1) = n_{11}, N_1(r_2) = n_{12}) = \alpha^*(t_2) - \alpha^*(t_1), \\[1ex]
P\Big(V_{r_1} \le d_1, V_{r_2} \le d_2, V_{r_3} > d_3 \,\Big|\, \bigcap_{j=1}^{3} \{N_1(r_j) = n_{1j}\}\Big) = \alpha^*(t_3) - \alpha^*(t_2), \\[1ex]
\qquad \vdots \\[1ex]
P\Big(V_{r_1} \le d_1, \ldots, V_{r_L} > d_L \,\Big|\, \bigcap_{j=1}^{L} \{N_1(r_j) = n_{1j}\}\Big) = \alpha - \alpha^*(t_{L-1}).
\end{cases} \tag{3.1}$$

The asymptotic joint normality of these conditional distributions has not 
been shown, except in the case of L = 2 under the generalized biased coin 
design [Zhang and Rosenberger (2008)]. 

We express (3.1) in terms of univariate conditional distributions, which 
are much easier to sample from than the joint distributions in (3.1). 



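Lemma 3.1. The set of conditions (3.1) is equivalent to

$$\begin{cases}
P(V_{r_1} > d_1 \mid N_1(r_1) = n_{11}) = \alpha^*(t_1), \\[1ex]
P\Big(V_{r_2} > d_2 \,\Big|\, V_{r_1} \le d_1, \bigcap_{j=1}^{2} \{N_1(r_j) = n_{1j}\}\Big) = \dfrac{\alpha^*(t_2) - \alpha^*(t_1)}{1 - \alpha^*(t_1)}, \\[2ex]
P\Big(V_{r_3} > d_3 \,\Big|\, \bigcap_{j=1}^{2} \{V_{r_j} \le d_j\}, \bigcap_{j=1}^{3} \{N_1(r_j) = n_{1j}\}\Big) = \dfrac{\alpha^*(t_3) - \alpha^*(t_2)}{1 - \alpha^*(t_2)}, \\[2ex]
\qquad \vdots \\[1ex]
P\Big(V_{r_L} > d_L \,\Big|\, \bigcap_{j=1}^{L-1} \{V_{r_j} \le d_j\}, \bigcap_{j=1}^{L} \{N_1(r_j) = n_{1j}\}\Big) = \dfrac{\alpha - \alpha^*(t_{L-1})}{1 - \alpha^*(t_{L-1})}.
\end{cases} \tag{3.2}$$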

Proof. See Appendix B in the supplementary material [Plamadeala and Rosenberger (2011)]. □

At each inspection $l$ in (3.2), the conditional reference set is the collection of all sequences satisfying $\bigcap_{i=1}^{l} \{N_1(r_i) = n_{1i}\}$. The following theorem can be used to sample sequences from such sets.

Theorem 3.1. Let $1 \le l \le L$, $r_0, r_1, r_2, \ldots, r_l$ and $n_{10}, n_{11}, \ldots, n_{1l}$ be defined as before, with $r_0 = 0$ and $n_{10} = 0$. Let $k = 1, \ldots, l$. For $r_{k-1} \le j < r_k$, $n_{1(k-1)} \le m_j \le j$ and $\phi_{j+1}(m_j) = P(T_{j+1} = 1 \mid N_1(j) = m_j)$, the rule

$$\psi_{j+1} = \phi_{j+1}(m_j) \frac{P(N_1(r_k) = n_{1k} \mid N_1(j+1) = m_j + 1)}{P(N_1(r_k) = n_{1k} \mid N_1(j) = m_j)} \tag{3.3}$$

can be used to sample a sequence that satisfies $\bigcap_{i=1}^{l} \{N_1(r_i) = n_{1i}\}$.

Proof. See Appendix C in the supplementary material [Plamadeala and Rosenberger (2011)]. □

Note that equation (3.3) reduces succinctly to the expected $\psi_{j+1} = (n_{1k} - m_j)/(r_k - j)$ for complete randomization, $l = 1, \ldots, L$, $k = 1, \ldots, l$, $r_{k-1} \le j < r_k$ and $n_{1(k-1)} \le m_j \le j$. For the BCD($p$) the numerator and the denominator of $\psi_{j+1}$ must be evaluated according to Theorem 2.2. To obtain a sequence from the reference set satisfying $\bigcap_{i=1}^{l} \{N_1(r_i) = n_{1i}\}$, the sampling must be done in $k = 1, \ldots, l$ steps as follows:

(1) At stage $k = 1$, apply $\psi_{j+1}$ with $r_0 \le j < r_1$ to sample the first $r_1$ assignments.

(2) At stage $k = 2$, apply $\psi_{j+1}$ with $r_1 \le j < r_2$ to sample the next $r_2 - r_1$ assignments.

(3) At stage $3 \le k \le l$, apply $\psi_{j+1}$ with $r_{k-1} \le j < r_k$ to sample the next $r_k - r_{k-1}$ assignments.

Suppose a sample of size $N_c$ (sequences) is sufficient to estimate a distribution quantile using some quantile estimator. The Monte Carlo algorithm that estimates the boundary $d_1, \ldots, d_L$ for an $\alpha$-level, upper-tailed, conditional randomization test with $L - 1$ interim inspections is as follows:

(1) At stage 1, generate $N_c$ randomization sequences of $r_1$ assignments from the reference set satisfying $N_1(r_1) = n_{11}$. Evaluate $V_{r_1}$ for each sequence; estimate $d_1$ using the nonparametric quantile estimator of Chen and Lazar (2010) based on the values of $V_{r_1}$.

(2) At stage 2, generate $N_c/(1 - \alpha^*(t_1))$ randomization sequences of $r_2$ assignments from the reference set satisfying $\bigcap_{i=1}^{2} \{N_1(r_i) = n_{1i}\}$. For each sequence, evaluate $V_{r_1}$ using the first $r_1$ of $r_2$ assignments only. Retain those sequences that satisfy $\{V_{r_1} \le d_1\}$. Evaluate $V_{r_2}$ for each retained sequence. Estimate $d_2$ using the quantile estimator of Chen and Lazar (2010) based on the values of $V_{r_2}$.

(3) At stage $3 \le l \le L$, generate $N_c / \prod_{i=1}^{l-1} (1 - [\alpha^*(t_i) - \alpha^*(t_{i-1})]/[1 - \alpha^*(t_{i-1})])$ randomization sequences of $r_l$ assignments from the reference set satisfying $\bigcap_{i=1}^{l} \{N_1(r_i) = n_{1i}\}$. Note that $\alpha^*(t_0) = 0$ and $\alpha^*(t_L) = \alpha$. For each sequence, evaluate $V_{r_1}, V_{r_2}, \ldots, V_{r_{l-1}}$ using the first $r_1, r_2, \ldots, r_{l-1}$ assignments, respectively. Retain those sequences that satisfy $\bigcap_{i=1}^{l-1} \{V_{r_i} \le d_i\}$. Evaluate $V_{r_l}$ for each retained sequence. Estimate $d_l$ using the quantile estimator of Chen and Lazar (2010) based on the values of $V_{r_l}$.

Requiring that $N_c / \prod_{i=1}^{l-1} (1 - [\alpha^*(t_i) - \alpha^*(t_{i-1})]/[1 - \alpha^*(t_{i-1})])$ randomization sequences be sampled at stage $l$ simply ensures that at least $N_c$ sequences are used for the estimation of $d_l$ at each stage $l$.
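The staged sampling and the boundary algorithm above can be sketched compactly in Python. This is only an illustration under several assumptions: the conditional probabilities in (3.3) are computed by a backward recursion within each block rather than from closed forms, simple rank scores are used, a plain empirical quantile stands in for the Chen and Lazar (2010) estimator, and the conditional levels passed in (the right-hand sides of (3.2)) as well as all function names and toy inputs are hypothetical.

```python
import random
from math import ceil


def phi(m, j, p):
    """BCD(p) probability of treatment 1 given N_1(j) = m."""
    if 2 * m == j:
        return 0.5
    return p if 2 * m < j else 1.0 - p


def block_table(r_prev, r_k, n1k, p):
    """b[j][m] = P(N_1(r_k) = n1k | N_1(j) = m) for r_prev <= j <= r_k."""
    b = {r_k: {n1k: 1.0}}
    for j in range(r_k - 1, r_prev - 1, -1):
        nxt = b[j + 1]
        b[j] = {m: phi(m, j, p) * nxt.get(m + 1, 0.0)
                   + (1.0 - phi(m, j, p)) * nxt.get(m, 0.0)
                for m in range(j + 1)}
    return b


def make_tables(looks, counts, p):
    starts = [0] + looks[:-1]
    return [block_table(rp, rk, c, p) for rp, rk, c in zip(starts, looks, counts)]


def sample_staged(looks, tables, p, rng):
    """One sequence with N_1(r_k) = n_1k at every look, block by block via (3.3)."""
    t, m = [], 0
    for r_prev, r_k, b in zip([0] + looks[:-1], looks, tables):
        for j in range(r_prev, r_k):
            psi = phi(m, j, p) * b[j + 1].get(m + 1, 0.0) / b[j][m]
            assign = 1 if rng.random() < psi else 0
            t.append(assign)
            m += assign
    return t


def centered_ranks(x):
    order = sorted(range(len(x)), key=lambda i: x[i])
    a = [0.0] * len(x)
    for r, i in enumerate(order, start=1):
        a[i] = r - (len(x) + 1) / 2.0
    return a


def estimate_boundaries(x, looks, counts, p, levels, n_c, rng):
    """Monte Carlo boundaries d_1..d_L; levels[l] is the conditional level at look l+1."""
    scores = [centered_ranks(x[:r]) for r in looks]

    def stat(t, l):                      # V_{r_l}: uses the first r_l assignments
        return sum(a * ti for a, ti in zip(scores[l], t))

    bounds, keep_prob = [], 1.0
    for l in range(len(looks)):
        tables = make_tables(looks[:l + 1], counts[:l + 1], p)
        kept = []
        for _ in range(ceil(n_c / keep_prob)):
            t = sample_staged(looks[:l + 1], tables, p, rng)
            if all(stat(t, i) <= bounds[i] for i in range(l)):
                kept.append(stat(t, l))
        kept.sort()
        idx = min(len(kept) - 1, int((1.0 - levels[l]) * len(kept)))
        bounds.append(kept[idx])         # simple empirical upper quantile
        keep_prob *= 1.0 - levels[l]
    return bounds


rng = random.Random(3)
looks, counts = [6, 9, 12], [3, 4, 6]    # toy looks r_l and counts n_1l
x = [rng.gauss(0.0, 1.0) for _ in range(12)]
bounds = estimate_boundaries(x, looks, counts, 2 / 3, [0.1, 0.1, 0.1], 200, rng)
```

The inflation factor `keep_prob` mirrors the product in step (3): at look $l$ roughly a fraction $\prod_{i<l}(1 - \alpha_i)$ of the generated sequences survives the earlier boundaries, so about $N_c$ remain for the quantile estimate.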

4. Randomization-based information. Fisher's information is defined under a population model, and hence it is not defined in the context of randomization-based inference. However, since Fisher's information approximates the inverse of the asymptotic variance of the test, it seems reasonable to define the randomization-based analog of the information fraction as the ratio of the variances [Rosenberger and Lachin (2002)],

$$t_l = \frac{a_{r_l}' \Sigma_{|r_l} a_{r_l}}{a_n' \Sigma_{|n} a_n}, \qquad l = 1, \ldots, L, \tag{4.1}$$

where $\Sigma_{|r_l} = \mathrm{Var}(T^{(r_l)} \mid N_1(r_1) = n_{11}, \ldots, N_1(r_l) = n_{1l})$. This requires specification of $\Sigma_{|r_l}$ and $\Sigma_{|n}$. We now derive these for Efron's biased coin design. We begin with three lemmas:

Lemma 4.1. Let $n = 2, 3, \ldots$ and $0 \le n_1 \le n$. Let $\phi_i(a) = P(T_i = 1 \mid N_1(i-1) = a)$ and $f_{j-1,b}^{(i,a+1)} = P(N_1(j-1) = b \mid N_1(i) = a + 1)$. For $1 \le i < j \le n$,

$$E(T_i T_j \mid N_1(n) = n_1) = \frac{\sum_{a=0}^{i-1} \phi_i(a) P(N_1(i-1) = a) \sum_{b=a+1}^{j-1} \phi_j(b) f_{j-1,b}^{(i,a+1)} f_{n,n_1}^{(j,b+1)}}{P(N_1(n) = n_1)}.$$

The conditional probabilities $f_{j-1,b}^{(i,a+1)}$ and $f_{n,n_1}^{(j,b+1)}$ are given by Theorem 2.2.

Proof. The result follows from an application of Bayes' theorem to $P(T_i = 1, T_j = 1 \mid N_1(n) = n_1)$ and the Markovian property of $N_1$. □

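Given that we observe $N_1(n) = n_1$, we now derive the variance-covariance matrix of $T$, denoted by $\Sigma_{|n_1}$.

Lemma 4.2. Let $n = 1, 2, \ldots$, $0 \le n_1 \le n$, $\vartheta_{i|n_1} = E(T_i \mid N_1(n) = n_1)$ and $\phi_i(a) = P(T_i = 1 \mid N_1(i-1) = a)$. For the BCD($p$)

$$\vartheta_{i|n_1} = \begin{cases} \dfrac{\sum_{a=0}^{i-1} P(N_1(i-1) = a)\, \phi_i(a)\, P(N_1(n) = n_1 \mid N_1(i) = a + 1)}{P(N_1(n) = n_1)}, & 2 \le i \le n, \\[2ex] \dfrac{1}{2} \dfrac{P(N_1(n) = n_1 \mid N_1(1) = 1)}{P(N_1(n) = n_1)}, & i = 1. \end{cases}$$

If $i < j$, the $(i,j)$th entry of $\Sigma_{|n_1}$ is

$$\sigma_{ij} = E(T_i T_j \mid N_1(n) = n_1) - \vartheta_{i|n_1} \vartheta_{j|n_1},$$

with $E(T_i T_j \mid N_1(n) = n_1)$ given by Lemma 4.1. If $i = j$, the $(i,j)$th entry of $\Sigma_{|n_1}$ is

$$\sigma_{ii} = \vartheta_{i|n_1} (1 - \vartheta_{i|n_1}).$$

Proof. The result follows from an application of Bayes' theorem to $P(T_i = 1 \mid N_1(n) = n_1)$, the Markovian property of $N_1$ and Lemma 4.1. □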
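The conditional means $\vartheta_{i|n_1}$ of Lemma 4.2 combine a forward probability, the allocation rule, and a backward conditional probability. The Python sketch below evaluates them numerically, again using forward and backward recursions in place of the closed forms; this substitution and the function names are assumptions of the illustration.

```python
def phi(m, j, p):
    """BCD(p): P(T_{j+1} = 1 | N_1(j) = m), so phi_i(a) = phi(a, i - 1, p)."""
    if 2 * m == j:
        return 0.5
    return p if 2 * m < j else 1.0 - p


def forward(n, p):
    """fwd[j][m] = P(N_1(j) = m)."""
    fwd = [[1.0]]
    for j in range(n):
        nxt = [0.0] * (j + 2)
        for m, pr in enumerate(fwd[j]):
            f = phi(m, j, p)
            nxt[m + 1] += pr * f
            nxt[m] += pr * (1.0 - f)
        fwd.append(nxt)
    return fwd


def backward(n, n1, p):
    """bwd[j][m] = P(N_1(n) = n1 | N_1(j) = m)."""
    bwd = [[0.0] * (j + 1) for j in range(n + 1)]
    bwd[n][n1] = 1.0
    for j in range(n - 1, -1, -1):
        for m in range(j + 1):
            f = phi(m, j, p)
            bwd[j][m] = f * bwd[j + 1][m + 1] + (1.0 - f) * bwd[j + 1][m]
    return bwd


def conditional_means(n, n1, p):
    """theta[i-1] = E(T_i | N_1(n) = n1), following the sum in Lemma 4.2."""
    fwd, bwd = forward(n, p), backward(n, n1, p)
    total = fwd[n][n1]
    return [
        sum(fwd[i - 1][a] * phi(a, i - 1, p) * bwd[i][a + 1] for a in range(i)) / total
        for i in range(1, n + 1)
    ]


theta = conditional_means(10, 5, 0.7)
```

Two sanity checks follow directly from the definitions: the conditional means must sum to $n_1$, and under perfect final balance the first assignment is equally likely to go to either arm.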

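Lemma 4.3. Let $1 \le l \le L$, $r_0, r_1, r_2, \ldots, r_l$ and $n_{10}, n_{11}, \ldots, n_{1l}$ be defined as before, with $r_0 = n_{10} = 0$. Let $\phi_i(a) = P(T_i = 1 \mid N_1(i-1) = a)$, $k = 1, \ldots, l$, and $f_{i-1,a}^{(r_{k-1}, n_{1(k-1)})} = P(N_1(i-1) = a \mid N_1(r_{k-1}) = n_{1(k-1)})$. Denote $\vartheta_{i|r_l} = E(T_i \mid \bigcap_{q=1}^{l} \{N_1(r_q) = n_{1q}\})$ and $\lambda_{ij|r_l} = E(T_i T_j \mid \bigcap_{q=1}^{l} \{N_1(r_q) = n_{1q}\})$.

For $1 \le k \le l$, $r_{k-1} < i \le r_k$,

$$\vartheta_{i|r_l} = \frac{\sum_{a=n_{1(k-1)}}^{i-1} \phi_i(a)\, f_{i-1,a}^{(r_{k-1}, n_{1(k-1)})} f_{r_k, n_{1k}}^{(i,a+1)}}{f_{r_k, n_{1k}}^{(r_{k-1}, n_{1(k-1)})}}.$$

For $1 \le k \le l$ and $r_{k-1} < i < j \le r_k$,

$$\lambda_{ij|r_l} = \frac{\sum_{a=n_{1(k-1)}}^{i-1} \phi_i(a)\, f_{i-1,a}^{(r_{k-1}, n_{1(k-1)})} \sum_{b=n_{1(k-1)}+1}^{j-1} \phi_j(b)\, f_{j-1,b}^{(i,a+1)} f_{r_k, n_{1k}}^{(j,b+1)}}{f_{r_k, n_{1k}}^{(r_{k-1}, n_{1(k-1)})}}.$$

For all other $i < j$,

$$\lambda_{ij|r_l} = E\Big(T_i \,\Big|\, \bigcap_{q=1}^{l} \{N_1(r_q) = n_{1q}\}\Big) E\Big(T_j \,\Big|\, \bigcap_{q=1}^{l} \{N_1(r_q) = n_{1q}\}\Big).$$

The probabilities $f_{i-1,a}^{(r_{k-1}, n_{1(k-1)})}$, $f_{j-1,b}^{(i,a+1)}$, $f_{r_k, n_{1k}}^{(j,b+1)}$ and $f_{r_k, n_{1k}}^{(r_{k-1}, n_{1(k-1)})}$ are given by Theorem 2.2.

Proof. See Appendix D in the supplementary material [Plamadeala and Rosenberger (2011)]. □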

Finally, the closed form of $\Sigma_{|r_l}$ is given in the following theorem, which follows immediately from Lemma 4.3:

Theorem 4.1. Let $1 \le l \le L$, $k = 1, \ldots, l$, $r_0, r_1, r_2, \ldots, r_l$ and $n_{10}, n_{11}, \ldots, n_{1l}$ be defined as before, with $r_0 = n_{10} = 0$. The $(i,j)$th entry of $\Sigma_{|r_l}$ under the BCD($p$) is

$$\sigma_{ij} = \begin{cases} \lambda_{ij|r_l} - \vartheta_{i|r_l} \vartheta_{j|r_l}, & \text{if } i < j \text{ and } r_{k-1} < i < j \le r_k, \\ \vartheta_{i|r_l} (1 - \vartheta_{i|r_l}), & \text{if } i = j, \\ 0, & \text{otherwise}, \end{cases}$$

where $\vartheta_{i|r_l}$ and $\lambda_{ij|r_l}$ are given by Lemma 4.3.

Table 3
Mean (SD) of simulated $\hat{\alpha}$ for an $\alpha = 0.05$ upper tail sequential test over a Monte Carlo sample size of 1000, $N_c = 2500$, interpolating the unknown observations by sampling with replacement

  Look l    r_l    n_1l    t_l       alpha_l (a)    d_l     alpha-hat
  ----------------------------------------------------------------------
  Look 1    250    126     0.3617    0.0011         1709
  Look 2    300    148     0.6248    0.0121         1688
  Look 3    350    174     1         0.0373         1501    0.0495 (0.0043)

  (a) $\alpha_l = \dfrac{\alpha^*(t_l) - \alpha^*(t_{l-1})}{1 - \alpha^*(t_{l-1})}$.

Although one can compute $\Sigma_{|n}$ and $\Sigma_{|r_l}$ exactly using Theorem 4.1, $a_n$ in (4.1) remains unknown at each interim inspection, since a portion of the data is unobserved. One would have to interpolate sequentially the remaining unknown data points in order to have a value for $a_n$ and an approximation for (4.1). Interpolating the unknown observations by sampling with replacement from the known observations is one way to obtain a value for $a_n$. In our simulations with data generated from two normal distributions, $L = 3$, $n = 350$, $n_1 = 174$ and assignments following the BCD(3/4), the approximate information fraction at the first interim look with $r_1 = 250$ and $n_{11} = 126$ was 0.3791, compared to the true information of 0.3759. At the second interim look with $r_2 = 300$ and $n_{12} = 148$, the approximate information fraction was 0.6380, compared to the true information of 0.6382.

We also simulate the probability of type I error in an example. For this purpose, we generate a sample of $n = 350$ observations from $N(1, 0.9)$ and simulate treatment assignments from BCD($p = 3/4$). We plan $L = 3$ looks: at $r_1 = 250$, $r_2 = 300$ and $r_3 = 350$. The observed number assigned to treatment 1 at each look was $n_{11} = 126$, $n_{12} = 148$ and $n_{13} = 174$. We compute the boundary values using the algorithm in Section 3. Table 3 gives the estimated type I error rate ($\hat{\alpha}$) and standard deviation over 1000 replications for this sequential conditional test. The probability of type I error is preserved with low variability.
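The conditional levels $\alpha_l$ listed in Table 3 can be reproduced from the spending function of Section 3 and the information fractions in the table. The Python sketch below is a minimal illustration; the function names are hypothetical, and small differences in the last digit can arise because the tabulated $t_l$ are rounded.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def obf_spending(t, alpha=0.05):
    """alpha*(t) = 2 - 2 * Phi(z_{alpha/2} / sqrt(t)), with alpha*(0) = 0."""
    if t <= 0.0:
        return 0.0
    z = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    return 2.0 - 2.0 * _STD_NORMAL.cdf(z / t ** 0.5)


def conditional_levels(fractions, alpha=0.05):
    """alpha_l = (alpha*(t_l) - alpha*(t_{l-1})) / (1 - alpha*(t_{l-1}))."""
    levels, spent_prev = [], 0.0
    for t in fractions:
        spent = obf_spending(t, alpha)
        levels.append((spent - spent_prev) / (1.0 - spent_prev))
        spent_prev = spent
    return levels


levels = conditional_levels([0.3617, 0.6248, 1.0])
```

These are the right-hand sides of (3.2), that is, the tail probabilities at which each boundary $d_l$ is estimated among the sequences that have not crossed an earlier boundary.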

5. Conclusions. We have provided a computational method to approximate conditional randomization tests, which can be extended to clinical trials that incorporate sequential monitoring. The key is to determine certain conditional probabilities from the particular randomization procedure. These techniques apply to any restricted randomization procedure of the form $\phi_{j+1} = \Pr(T_{j+1} = 1 \mid N_1(j))$ for which closed form conditional probabilities can be obtained. We have derived the exact conditional distribution of $N_1(n)$, given $N_1(j)$, for Efron's BCD($p$) using combinatoric arguments, as well as the conditional variance-covariance matrix of $T$, which allows computation of the information fraction.

The class of generalized biased coin designs (GBCD) [Wei (1978)] does not 
have a known form for the exact conditional distribution, and this remains 
an open problem. For the sequential monitoring of conditional tests using 
the GBCD with one interim look, Zhang and Rosenberger (2008) derived 
the joint asymptotic distribution of the interim and the final test statistics, 
which allows for an asymptotic test. 

Acknowledgments. The authors thank Tigran Markaryan, Anindya Roy

SUPPLEMENTARY MATERIAL

Supplement to "Sequential monitoring with conditional randomization tests" (DOI: 10.1214/11-AOS941SUPP; .pdf). The supplement contains Appendix A (proof of Theorem 2.2), Appendix B (proof of Lemma 3.1), Appendix C (proof of Theorem 3.1), and Appendix D (proof of Lemma 4.3).